
    Spectral Style Transfer for Human Motion between Independent Actions

    Human motion is complex and difficult to synthesize realistically. Automatic style transfer, which transforms the mood or identity of a character's motion, is a key technology for increasing the value of already synthesized or captured motion data. Typically, state-of-the-art methods require every independent action observed in the input to be present in a given style database in order to perform realistic style transfer. We introduce a spectral style transfer method for human motion between independent actions, thereby greatly reducing the effort and cost of creating such databases. We leverage a spectral-domain representation of human motion to formulate a spatial-correspondence-free approach. We extract spectral intensity representations of the reference and source styles for an arbitrary action, and transfer their difference to a novel motion that may contain previously unseen actions. Building on this core method, we introduce a temporally sliding window filter that performs the same analysis locally in time for heterogeneous motion processing. This immediately allows our approach to serve as a style-database enhancement technique that fills in missing actions to improve the performance of previous style transfer methods. We evaluate our method both through quantitative experiments and through controlled user studies comparing against previous work, where our approach shows significant improvement.
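    The core idea of transferring a spectral intensity difference can be sketched as follows; this is a minimal illustration, assuming single-channel joint trajectories of equal length and a simple per-frequency magnitude ratio, not the paper's full formulation.

```python
import numpy as np

def spectral_style_transfer(source_style, reference_style, novel_motion):
    """Transfer the spectral-intensity difference between a source-style
    and a reference-style example onto a novel motion. Hypothetical
    simplification: one 1D joint trajectory per array, equal lengths."""
    S = np.fft.rfft(source_style)     # spectra of the style examples
    R = np.fft.rfft(reference_style)
    N = np.fft.rfft(novel_motion)     # spectrum of the motion to stylize
    # Scale the novel motion's per-frequency magnitudes by the
    # reference/source intensity ratio while keeping the novel motion's
    # own phases -- hence no spatial (pose) correspondence is needed.
    eps = 1e-8
    gain = (np.abs(R) + eps) / (np.abs(S) + eps)
    return np.fft.irfft(N * gain, n=len(novel_motion))
```

    Because only magnitudes are exchanged, the novel motion's timing and content (carried by the phase) survive, which is what lets the style come from a different action.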

    RAID: A relation-augmented image descriptor

    As humans, we regularly interpret scenes based on how objects are related, rather than based on the objects themselves. For example, we see a person riding an object X or a plank bridging two objects. Current methods provide limited support for searching content based on such relations. We present RAID, a relation-augmented image descriptor that supports queries based on inter-region relations. The key idea of our descriptor is to encode a region-to-region relation as the spatial distribution of point-to-region relationships between the two image regions. RAID allows sketch-based retrieval and requires minimal training data, thus making it suited even for querying uncommon relations. We evaluate the proposed descriptor by querying large image databases and successfully extract nontrivial images demonstrating complex inter-region relations that are easily missed or erroneously classified by existing methods. We assess the robustness of RAID on multiple datasets, even when the region segmentation is computed automatically or is very noisy.
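    A toy version of encoding a region-to-region relation as a distribution of point-to-region relationships might look like the following; the angular-histogram choice here is a hypothetical simplification, not the descriptor actually defined in the paper.

```python
import numpy as np

def relation_descriptor(region_a, region_b, n_angle_bins=8):
    """Toy relation descriptor: the relation from region A to region B is
    encoded as the distribution, over points of A, of directions toward
    points of B. Hypothetical stand-in for RAID's point-to-region maps."""
    a = np.asarray(region_a, dtype=float)  # (n, 2) pixel coordinates of A
    b = np.asarray(region_b, dtype=float)  # (m, 2) pixel coordinates of B
    hist = np.zeros(n_angle_bins)
    for p in a:
        d = b - p                            # vectors from point p into B
        ang = np.arctan2(d[:, 1], d[:, 0])   # direction of each vector
        idx = ((ang + np.pi) / (2 * np.pi) * n_angle_bins).astype(int)
        hist += np.bincount(idx % n_angle_bins, minlength=n_angle_bins)
    return hist / hist.sum()                 # normalized spatial distribution
```

    Note the descriptor is asymmetric: "A left-of B" and "B left-of A" produce different histograms, which is exactly what makes relational queries expressible.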

    SMASH: Data-driven Reconstruction of Physically Valid Collisions

    Collision sequences are commonly used in games and entertainment to add drama and excitement. Authoring even two-body collisions in the real world can be difficult, as one has to correctly synchronize the timing and the object trajectories. After trial-and-error iterations, when objects can actually be made to collide, the resulting motion is still difficult to acquire in 3D. In contrast, synthetically generating plausible collisions is difficult because it requires adjusting various collision parameters (e.g., object mass ratio, coefficient of restitution, etc.) as well as appropriate initial conditions. We present SMASH to directly ‘read off’ appropriate collision parameters simply from input video recordings. Specifically, we describe how to use the laws of rigid body collision to regularize the problem of lifting 2D annotated poses to 3D reconstructions of collision sequences. The reconstructed sequences can then be modified and combined to easily author novel and plausible collision sequences. We demonstrate the system on various complex collision sequences.
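    In the simplest 1D two-body case, the rigid-body collision laws indeed let the parameters be "read off" from pre- and post-collision velocities; this is only a toy analogue of the paper's video-based setting.

```python
def read_off_collision_params(v1, v2, v1p, v2p):
    """Recover coefficient of restitution and mass ratio m1/m2 from the
    pre-collision (v1, v2) and post-collision (v1p, v2p) velocities of a
    1D two-body collision. Hypothetical simplification of reading off
    collision parameters from tracked trajectories."""
    # Restitution: ratio of separation speed to approach speed.
    e = -(v1p - v2p) / (v1 - v2)
    # Conservation of momentum: m1 * (v1 - v1p) = m2 * (v2p - v2).
    mass_ratio = (v2p - v2) / (v1 - v1p)
    return e, mass_ratio
```

    In the paper's setting these constraints are used in the other direction, as regularizers that make the 2D-to-3D lifting well posed rather than as closed-form formulas.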

    Softmesh: Learning Probabilistic Mesh Connectivity via Image Supervision

    In this work we introduce Softmesh, a fully differentiable pipeline to transform a 3D point cloud into a probabilistic mesh representation that allows us to directly render 2D images. We use this pipeline to learn point connectivity from only 2D rendering supervision, reducing the supervision requirements for mesh-based representations. We evaluate our approach in a set of rendering tasks, including silhouette, normal, and depth rendering, on both rigid and non-rigid objects. We introduce transfer learning approaches to handle the diversity of the task requirements, and also explore the potential of learning across categories. We demonstrate that Softmesh achieves competitive performance even against methods trained with full mesh supervision.
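    One way a probabilistic mesh can stay differentiable under rendering is to weight each candidate face's contribution by its existence probability; the compositing rule below is an assumed illustration, not necessarily Softmesh's actual renderer.

```python
import numpy as np

def expected_silhouette(face_masks, face_probs):
    """Soft silhouette from probabilistic connectivity: each candidate
    triangle contributes its rasterized coverage mask weighted by its
    existence probability. Hypothetical simplification of rendering a
    probabilistic mesh. face_masks: (F, H, W) in {0, 1}; face_probs: (F,)."""
    m = np.asarray(face_masks, dtype=float)
    p = np.asarray(face_probs, dtype=float)[:, None, None]
    # P(pixel covered) = 1 - prod over faces of (1 - p_f * covered_f),
    # treating faces as independent -- smooth in the probabilities, so
    # 2D silhouette losses can backpropagate to the connectivity.
    return 1.0 - np.prod(1.0 - p * m, axis=0)
```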

    Search for Concepts: Discovering Visual Concepts Using Direct Optimization

    Finding an unsupervised decomposition of an image into individual objects is a key step toward leveraging compositionality and performing symbolic reasoning. Traditionally, this problem is solved using amortized inference, which does not generalize beyond the scope of the training data, may sometimes miss correct decompositions, and requires large amounts of training data. We propose finding a decomposition using direct, un-amortized optimization, combining gradient-based optimization for differentiable object properties with global search for non-differentiable properties. We show that direct optimization generalizes better, misses fewer correct decompositions, and typically requires less data than methods based on amortized inference. This highlights a weakness of the currently prevalent practice of using amortized inference, which can potentially be improved by integrating more direct optimization elements.
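    The hybrid of global search over non-differentiable properties and gradient descent over differentiable ones can be sketched on a 1D toy problem; the single-bump model, candidate widths, and step sizes here are all hypothetical choices for illustration.

```python
import numpy as np

def fit_decomposition(signal, xs, candidate_widths, steps=300, lr=0.01):
    """Toy direct optimization: explain a 1D signal as one soft 'object'
    whose width is treated as non-differentiable (enumerated by global
    search) and whose center is differentiable (fit by gradient descent).
    Hypothetical illustration, not the paper's scene model."""
    best = (np.inf, None, None)
    for w in candidate_widths:            # global search over the width
        c = float(xs.mean())              # initial guess for the center
        for _ in range(steps):            # gradient descent on the center
            render = np.exp(-((xs - c) / w) ** 2)
            resid = render - signal
            # dL/dc for L = sum(resid^2), with dr/dc = r * 2(x-c)/w^2
            grad = np.sum(2.0 * resid * render * 2.0 * (xs - c) / w ** 2)
            c -= lr * grad
        loss = float(np.sum((np.exp(-((xs - c) / w) ** 2) - signal) ** 2))
        if loss < best[0]:
            best = (loss, w, c)
    return best                           # (loss, width, center)
```

    Because nothing is amortized, the same routine works on any input signal, at the cost of running an optimization per instance.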

    Detecting the point of release of virtual projectiles in AR/VR

    Our aim is to detect the point of release of a thrown virtual projectile in VR/AR. We capture the full-body motion of 18 participants throwing virtual projectiles and extract motion features, such as position, velocity, rotation, and rotational velocity, for arm joints. Frame-level binary classifiers that estimate the point of release are trained and evaluated using a metric that prioritizes detection timing, yielding a ranking of joints and motion features. We find that the wrist joint and the rotation motion feature are the most accurate, which can help when placing simple motion-tracking sensors for real-time throw detection.
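    A minimal stand-in for the frame-level detection and the timing-first evaluation might look like this; the speed-threshold heuristic and the 90 fps default are assumptions, not the paper's learned classifiers.

```python
import numpy as np

def detect_release(wrist_speed, threshold):
    """Heuristic frame-level detector: after the wrist-speed peak of the
    throw, report the first frame where speed falls below a threshold.
    Hypothetical stand-in for a trained per-frame binary classifier."""
    peak = int(np.argmax(wrist_speed))
    for f in range(peak, len(wrist_speed)):
        if wrist_speed[f] < threshold:
            return f                      # estimated release frame
    return len(wrist_speed) - 1

def timing_error_ms(pred_frame, true_frame, fps=90):
    """Timing-prioritizing metric: absolute release-time error in ms."""
    return abs(pred_frame - true_frame) / fps * 1000.0
```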

    Preface


    PATEX: Exploring Pattern Variations

    Patterns play a central role in 2D graphic design. A critical step in the design of patterns is evaluating multiple design alternatives. Exploring these alternatives with existing tools is challenging because most tools force users to work with a single fixed representation of the pattern that encodes a specific set of geometric relationships between pattern elements. However, for most patterns, there are many different interpretations of a pattern's regularity that correspond to different design variations. The exponential nature of this variation space makes the problem of finding all variations intractable. We present a method called PATEX to characterize and efficiently identify distinct and valid pattern variations, allowing users to directly navigate the variation space. Technically, we propose a novel linear approximation to handle the complexity of the problem and efficiently enumerate suitable pattern variations under proposed element movements. We also present two pattern editing interfaces that expose the detected pattern variations as suggested edits to the user. We show a diverse collection of pattern edits and variations created with PATEX. The results from our user study indicate that our suggested variations can be useful and inspirational for typical pattern editing tasks.
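    The idea that one proposed element movement yields different variations under different interpretations of regularity can be illustrated on a 1D row of elements; the two interpretations below are a hypothetical toy, far simpler than PATEX's linearized variation model.

```python
import numpy as np

def propagate_edit(positions, moved_idx, new_pos, preserve_spacing):
    """Two interpretations of the same edit on a 1D pattern: under a
    uniform-spacing interpretation the movement propagates to all
    elements (rigid translation); without it, only the chosen element
    moves. Hypothetical simplification of enumerating pattern
    variations under a proposed element movement."""
    x = np.asarray(positions, dtype=float).copy()
    delta = new_pos - x[moved_idx]
    if preserve_spacing:
        x += delta                # keep all pairwise spacings intact
    else:
        x[moved_idx] = new_pos    # break the spacing relation locally
    return x
```

    Each interpretation of regularity corresponds to one such propagation rule, and enumerating the valid rules is what produces the suggested edits.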

    Plausible Shading Decomposition For Layered Photo Retouching

    Photographers routinely compose multiple manipulated photos of the same scene (layers) into a single image that is better than any individual photo could be alone. Similarly, 3D artists set up rendering systems to produce layered images, each containing only an individual aspect of the light transport, which are composited into the final result in post-production. Regrettably, the first approach takes considerable time to capture, while the second remains limited to synthetic scenes. In this paper, we propose a system that decomposes a single image into a plausible shading decomposition (PSD) approximating effects such as shadow, diffuse illumination, albedo, and specular shading. This decomposition can then be manipulated in any off-the-shelf image manipulation software and recomposited back. We perform the decomposition with a convolutional neural network trained on synthetic data. We demonstrate the effectiveness of our decomposition on synthetic (i.e., rendered) and real data (i.e., photographs), and use it for common photo manipulations that are nearly impossible to perform otherwise from a single image.
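    The recompositing step can be sketched with a standard layered shading model; the specific formula (albedo times shadowed diffuse shading, plus additive specular) is a common assumption and may differ from the compositing model used in the paper.

```python
import numpy as np

def recompose(albedo, diffuse, shadow, specular):
    """Recomposite edited shading layers into a final image using an
    assumed layered model: image ~ albedo * shadow * diffuse + specular.
    All inputs are same-shape arrays in [0, 1]."""
    return np.clip(albedo * shadow * diffuse + specular, 0.0, 1.0)
```

    Because the model is multiplicative in albedo and shading, an artist can, for example, brighten shadows or repaint the albedo in ordinary image software and then recomposite without touching the other layers.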